METR-LA dataset




T-Graphormer: Using Transformers for Spatiotemporal Forecasting

Bai, Hao Yuan, Liu, Xue

arXiv.org Artificial Intelligence

Multivariate time series data is ubiquitous, and forecasting it has important applications in many domains. However, its complex spatial dependencies and non-linear temporal dynamics can be challenging for traditional techniques. Existing methods tackle these challenges by learning the two dimensions separately. Here, we introduce Temporal Graphormer (T-Graphormer), a Transformer-based approach capable of modelling spatiotemporal correlations simultaneously. By incorporating temporal dynamics in the Graphormer architecture, each node attends to all other nodes within the graph sequence. Our design enables the model to capture rich spatiotemporal patterns with minimal reliance on predefined spacetime inductive biases. We validate the effectiveness of T-Graphormer on real-world traffic prediction benchmark datasets. Compared to state-of-the-art methods, T-Graphormer reduces root mean squared error (RMSE) and mean absolute percentage error (MAPE) by up to 10%.
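The two error metrics named in this abstract can be written out directly. The following is a minimal sketch in plain Python, not the authors' evaluation code; the function names and the sample speed readings are hypothetical, and benchmark implementations often also mask missing sensor readings.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error over paired observations."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Zero targets are skipped to avoid division by zero (an assumption
    of this sketch, not something stated in the abstract).
    """
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(ratios) / len(ratios)


def relative_reduction(baseline, candidate):
    """Percentage by which `candidate` lowers an error relative to `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


# Hypothetical speed readings (mph) and forecasts for three sensors.
observed = [60.0, 50.0, 40.0]
forecast = [57.0, 54.0, 40.0]

print(round(rmse(observed, forecast), 3))
print(round(mape(observed, forecast), 3))
print(relative_reduction(5.0, 4.5))
```

Under this definition, a reported reduction of "up to 10%" corresponds to `relative_reduction` returning at most 10 for the compared error values.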


Temporal Graph MLP Mixer for Spatio-Temporal Forecasting

Bilal, Muhammad, Lopez, Luis Carretero

arXiv.org Artificial Intelligence

Spatiotemporal forecasting is critical in applications such as traffic prediction, climate modeling, and environmental monitoring. However, the prevalence of missing data in real-world sensor networks significantly complicates this task. In this paper, we introduce the Temporal Graph MLP-Mixer (T-GMM), a novel architecture designed to address these challenges. The model combines node-level processing with patch-level subgraph encoding to capture localized spatial dependencies while leveraging a three-dimensional MLP-Mixer to handle temporal, spatial, and feature-based dependencies. Experiments on the AQI, ENGRAD, PV-US and METR-LA datasets demonstrate the model's ability to effectively forecast even in the presence of significant missing data. While not surpassing state-of-the-art models in all scenarios, the T-GMM exhibits strong learning capabilities, particularly in capturing long-range dependencies. These results highlight its potential for robust, scalable spatiotemporal forecasting.
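When sensor readings are missing, a common convention is to score forecasts only on the entries that were actually observed. The sketch below illustrates that convention with a masked mean absolute error; it is an assumption about typical practice, not a description of how the T-GMM authors evaluate, and the function name and data are hypothetical.

```python
def masked_mae(y_true, y_pred, mask):
    """Mean absolute error over entries whose mask value is truthy.

    `mask[i]` is 1 when reading i was observed and 0 when it is missing,
    so placeholder values at missing positions never affect the score.
    """
    observed = [(t, p) for t, p, m in zip(y_true, y_pred, mask) if m]
    if not observed:
        raise ValueError("no observed entries to score")
    return sum(abs(t - p) for t, p in observed) / len(observed)


# Hypothetical readings: the second sensor dropped out (stored as 0.0).
targets = [10.0, 0.0, 30.0]
predictions = [12.0, 5.0, 27.0]
observed_mask = [1, 0, 1]

print(masked_mae(targets, predictions, observed_mask))
```

Without the mask, the placeholder 0.0 at the missing position would be treated as a real target and would distort the error.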


Memory-enhanced Invariant Prompt Learning for Urban Flow Prediction under Distribution Shifts

Jiang, Haiyang, Chen, Tong, Zhang, Wentao, Hung, Nguyen Quoc Viet, Yuan, Yuan, Li, Yong, Cui, Lizhen

arXiv.org Machine Learning

Urban flow prediction is a classic spatial-temporal forecasting task that estimates the amount of future traffic flow for a given location. Though models represented by Spatial-Temporal Graph Neural Networks (STGNNs) have established themselves as capable predictors, they tend to suffer from the distribution shifts that are common in urban flow data, owing to the dynamics and unpredictability of spatial-temporal events. Unfortunately, in spatial-temporal applications, dynamic environments can hardly be quantified via a fixed number of parameters, whereas learning time- and location-specific environments can quickly become computationally prohibitive. In this paper, we propose a novel framework named Memory-enhanced Invariant Prompt learning (MIP) for urban flow prediction under constant distribution shifts. Specifically, MIP is equipped with a learnable memory bank that is trained to memorize the causal features within the spatial-temporal graph. By querying this memory bank, we adaptively extract invariant and variant prompts (i.e., patterns) for a given location at every time step. Then, instead of intervening on the raw data based on simulated environments, we directly perform interventions on the variant prompts across space and time. With the intervened variant prompts in place, we use invariant learning to minimize the variance of predictions, so as to ensure that predictions are made only with invariant features. With extensive comparative experiments on two public urban flow datasets, we thoroughly demonstrate the robustness of MIP against out-of-distribution (OOD) data.
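The idea of minimizing the variance of predictions across interventions can be illustrated with a simple penalty term. This is a hedged sketch only: the abstract does not give MIP's loss, so the function name, the weighting parameter `lam`, and the numbers below are all assumptions chosen for illustration.

```python
import statistics


def variance_penalized_loss(errors_per_intervention, lam=1.0):
    """Mean error plus `lam` times the variance of errors across interventions.

    Each entry is the prediction error measured after one (hypothetical)
    intervention on the variant prompts. A predictor that relies only on
    invariant features should score similarly under every intervention,
    which drives the variance term toward zero.
    """
    mean_error = statistics.fmean(errors_per_intervention)
    spread = statistics.pvariance(errors_per_intervention)
    return mean_error + lam * spread


# Hypothetical errors under three interventions.
unstable = [10.0, 12.0, 14.0]  # predictions react to the variant prompts
stable = [12.0, 12.0, 12.0]    # predictions ignore the variant prompts

print(variance_penalized_loss(unstable))
print(variance_penalized_loss(stable))
```

Both predictors have the same mean error here, but the stable one incurs no variance penalty and therefore a lower total loss.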


A Comparative Study on Basic Elements of Deep Learning Models for Spatial-Temporal Traffic Forecasting

Shin, Yuyol, Yoon, Yoonjin

arXiv.org Machine Learning

Traffic forecasting plays a crucial role in intelligent transportation systems. The spatial-temporal complexities of transportation networks make the problem especially challenging. Recently proposed deep learning models share basic elements such as graph convolution, graph attention, recurrent units, and/or attention mechanisms. In this study, we designed an in-depth comparative study of four deep neural network models utilizing different basic elements. For base models, one RNN-based model and one attention-based model were chosen from previous literature. Then, the spatial feature extraction layers in the models were substituted with graph convolution and graph attention. To analyze the performance of each element in various environments, we conducted experiments on four real-world datasets: highway speed, highway flow, urban speed from a homogeneous road link network, and urban speed from a heterogeneous road link network. The results demonstrate that the RNN-based model and the attention-based model show a similar level of performance for short-term prediction, and the attention-based model outperforms the RNN-based model in longer-term predictions. The choice between graph convolution and graph attention makes a larger difference in the RNN-based models. Also, our modified version of GMAN shows comparable performance to the original with less memory consumption.


FC-GAGA: Fully Connected Gated Graph Architecture for Spatio-Temporal Traffic Forecasting

Oreshkin, Boris N., Amini, Arezou, Coyle, Lucy, Coates, Mark J.

arXiv.org Machine Learning

Forecasting of multivariate time-series is an important problem that has applications in many domains, including traffic management, cellular network configuration, and quantitative finance. In recent years, researchers have demonstrated the value of applying deep learning architectures to these problems. A special case of the problem arises when there is a graph available that captures the relationships between the time-series. In this paper we propose a novel learning architecture that achieves performance competitive with or better than the best existing algorithms, without requiring knowledge of the graph. The key elements of our proposed architecture are (i) jointly performing backcasting and forecasting with a deep fully-connected architecture; (ii) stacking multiple prediction modules that target successive residuals; and (iii) learning a separate causal relationship graph for each layer of the stack. We can view each layer as predicting a component of the time-series; the differing nature of the causal graphs at different layers can be interpreted as indicating that the multivariate predictive relationships differ for different components. Experimental results for two public traffic network datasets illustrate the value of our approach, and ablation studies confirm the importance of each element of the architecture.
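Elements (i) and (ii) of this abstract, joint backcasting and forecasting with modules stacked on successive residuals, can be sketched as a short loop. This is a simplified illustration, not the FC-GAGA implementation: the learned fully connected layers and the per-layer graph are omitted, and `mean_block` is a hypothetical stand-in for a trained module.

```python
def mean_block(window, horizon):
    """Toy block: explains the window by its mean level.

    Returns a backcast (its reconstruction of the input window) and a
    partial forecast for the next `horizon` steps.
    """
    level = sum(window) / len(window)
    return [level] * len(window), [level] * horizon


def stacked_forecast(window, blocks, horizon):
    """Run blocks in sequence, each one seeing the previous block's residual.

    Every block removes the part of the signal it can explain (its
    backcast) and contributes a partial forecast; the partial forecasts
    are summed into the final prediction.
    """
    residual = list(window)
    forecast = [0.0] * horizon
    for block in blocks:
        backcast, partial = block(residual, horizon)
        residual = [r - b for r, b in zip(residual, backcast)]
        forecast = [f + p for f, p in zip(forecast, partial)]
    return forecast, residual


history = [2.0, 4.0, 6.0]  # hypothetical readings from one sensor
prediction, leftover = stacked_forecast(history, [mean_block, mean_block], horizon=2)
print(prediction)
print(leftover)
```

In this toy run the first block captures the mean level and the second block receives a zero-mean residual, so it adds nothing further; a trained block would instead learn to explain part of what remains.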